Selected Sense Enumerated Lexical Resources for Czech
Abstract
In this paper we present three quite different approaches to describing word senses, as used in three particular lexicons. We discuss the advantages and disadvantages of each approach. We carried out practical experiments with all three, including machine learning and manual annotation, and describe them briefly. We conclude by comparing the three lexicons.
Similar resources
Exploring and Extending Czech WordNet and VerbaLex
This paper presents the use of two major, linguist-made lexical resources for the Czech language: WordNet and VerbaLex. First, a conversion to RDF was performed. Afterwards, a Prolog program was used to analyse Czech-language input. In the second part of the article an extension to the current VerbaLex is proposed. Possible pitfalls are discussed. In the conclusion, we emphasize the side-effect of this work: ...
Verb Valency Frames in Czech Legal Texts
This paper deals with valency frames for a selected group of Czech verbs belonging to the domain of law. Starting with the lexical database VerbaLex, we propose semantic roles for these verbs and formulate their Complex Valency Frames. The lexical database VerbaLex has been developed recently at the NLP Centre FI MU and contains approximately 10 500 Czech verbs. We integrate the proposed ’law’ val...
Semi-automatic Acquisition of Lexical Resources and Grammars for Event Extraction in Bulgarian and Czech
In this paper we present a semi-automatic approach for the acquisition of lexico-syntactic knowledge for event extraction in two Slavic languages, namely Bulgarian and Czech. The method uses several weakly supervised and unsupervised algorithms based on distributional semantics. Moreover, an intervention from a language expert is envisaged at different steps in the learning procedure, which increase...
Merging Data Resources for Inflectional and Derivational Morphology in Czech
The paper deals with merging two complementary resources of morphological data previously existing for Czech, namely the inflectional dictionary MorfFlex CZ and the recently developed lexical network DeriNet. The MorfFlex CZ dictionary has been used by a morphological analyzer capable of analyzing/generating several million Czech word forms according to the rules of Czech inflection. The DeriNe...
The open lexical infrastructure of Språkbanken
We present our ongoing work on Karp, the open lexical infrastructure of Språkbanken (the Swedish Language Bank), which has two main functions: (1) to support the work on creating, curating, and integrating our various lexical resources; and (2) to publish daily versions of the resources, making them searchable and downloadable. An important requirement for the lexical infrastructure is also that we m...